Incremental Mixture Learning for Clustering Discrete Data
Abstract
This paper elaborates on an efficient approach to clustering discrete data by incrementally building multinomial mixture models through likelihood maximization with the Expectation-Maximization (EM) algorithm. At each step, the method adds one new multinomial component to the mixture model, using a combined scheme of global and local search to deal with the initialization problem of the EM algorithm. In the global search phase, several initial values are examined for the parameters of the new multinomial component. These values are selected from an appropriately defined set of initialization candidates. Two methods are proposed for specifying the elements of this set, based on the agglomerative and the kd-tree clustering algorithms. We investigate the performance of the incremental learning technique on a synthetic and a real dataset, and provide comparative results against the standard EM-based multinomial mixture model.
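The incremental scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: the candidate set is built from randomly sampled data rows (standing in for the agglomerative and kd-tree constructions, which the abstract does not detail), the new component enters with mixing weight 1/(k+1), the global search scores each candidate by the log-likelihood of the enlarged mixture, and the local search is a fixed number of full EM iterations. All function names and parameters (`incremental_mixture`, `n_candidates`, `eps`) are hypothetical.

```python
import numpy as np

def log_likelihood(X, weights, theta):
    # X: (n, d) count vectors; theta: (k, d) multinomial parameters.
    # The multinomial coefficient is omitted (constant across components).
    log_joint = X @ np.log(theta).T + np.log(weights)
    m = log_joint.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))))

def em(X, weights, theta, n_iter=50, eps=1e-3):
    # Local search: standard EM updates for a multinomial mixture.
    for _ in range(n_iter):
        # E-step: responsibilities, computed stably in log space.
        log_joint = X @ np.log(theta).T + np.log(weights)
        log_joint -= log_joint.max(axis=1, keepdims=True)
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mixing weights and smoothed multinomial parameters.
        weights = np.maximum(resp.mean(axis=0), 1e-12)
        weights /= weights.sum()
        counts = resp.T @ X + eps  # eps is an assumed smoothing constant
        theta = counts / counts.sum(axis=1, keepdims=True)
    return weights, theta

def add_component(X, weights, theta, candidates):
    # Global search: try each candidate as the new component's initial
    # parameters and keep the one with the highest log-likelihood.
    alpha = 1.0 / (len(weights) + 1)  # assumed initial weight of new component
    best = None
    for cand in candidates:
        w = np.append((1.0 - alpha) * weights, alpha)
        t = np.vstack([theta, cand])
        ll = log_likelihood(X, w, t)
        if best is None or ll > best[0]:
            best = (ll, w, t)
    # Local search: refine the best initialization with EM.
    return em(X, best[1], best[2])

def incremental_mixture(X, n_components, n_candidates=10, seed=0, eps=1e-3):
    rng = np.random.default_rng(seed)
    # Start from a single component fitted to all the data.
    counts = X.sum(axis=0) + eps
    theta = (counts / counts.sum())[None, :]
    weights = np.array([1.0])
    while len(weights) < n_components:
        # Assumed candidate set: normalized, smoothed random data rows.
        idx = rng.choice(len(X), size=min(n_candidates, len(X)), replace=False)
        cands = X[idx] + eps
        cands = cands / cands.sum(axis=1, keepdims=True)
        weights, theta = add_component(X, weights, theta, cands)
    return weights, theta
```

With `X` an array of per-item count vectors, `incremental_mixture(X, 3)` would return the mixing weights and the three multinomial parameter vectors. Because components are added one at a time, the intermediate mixtures with one and two components are available as a by-product.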
Similar resources
A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System
In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use a boosting ensemble of weak classifiers to implement the misuse intrusion detection system. It can identify new types of intrusions that do not exist in the training dataset for incremental misuse detection. As...
Incremental Learning of Multivariate Gaussian Mixture Models
This paper presents a new algorithm for unsupervised incremental learning based on a Bayesian framework. The algorithm, called IGMM (for Incremental Gaussian Mixture Model), creates and continually adjusts a Gaussian Mixture Model consistent with all sequentially presented data. IGMM is particularly useful for on-line incremental clustering of data streams, as encountered in the domain of mobile ...
Incremental Learning of Gaussian Mixture Models
Gaussian Mixture Modeling (GMM) is a parametric method for high dimensional density estimation. Incremental learning of GMM is very important in problems such as clustering of streaming data and robot localization in dynamic environments. Traditional GMM estimation algorithms like EM Clustering tend to be computationally very intensive in these scenarios. We present an incremental GMM estimatio...
Incremental Learning of Nonparametric Bayesian Mixture Models: Extended Thesis Chapter
Clustering is a fundamental task in many vision applications. To date, most clustering algorithms work in a batch setting and training examples must be gathered in a large group before learning can begin. Here we explore incremental clustering, in which data can arrive continuously. We present a novel incremental model-based clustering algorithm based on nonparametric Bayesian methods, which we...
Visual Scenes Clustering Using Variational Incremental Learning of Infinite Generalized Dirichlet Mixture Models
In this paper, we develop a clustering approach based on variational incremental learning of a Dirichlet process of generalized Dirichlet (GD) distributions. Our approach is built on nonparametric Bayesian analysis where the determination of the complexity of the mixture model (i.e. the number of components) is sidestepped by assuming an infinite number of mixture components. By leveraging an i...
Publication date: 2004